Abstract

It is shown that the capacity of a classical-quantum channel with arbitrary (possibly mixed) states equals the maximum of the entropy bound with respect to all a priori distributions. This completes the recent result of Hausladen, Jozsa, Schumacher, Westmoreland and Wootters […], who proved the equality for the pure state channel.



1. Information and capacity for quantum channel. We start by repeating some definitions and results from […]. Let $\mathcal{H}$ be a $d$-dimensional Hilbert space. We denote $D = \{1, \dots, d\}$. A simple quantum communication channel (classical-quantum channel in the terminology of […]) consists of the input alphabet $A = \{1, \dots, a\}$ and a mapping $i \to S_i$ from the input alphabet to the set of quantum states in $\mathcal{H}$. A quantum state is a density operator (d. o.), i. e. a positive operator $S$ in $\mathcal{H}$ with unit trace, $\mathrm{Tr}\, S = 1$. A coding is a probability distribution $\pi = \{\pi_i\}$ on $A$. A decoding is a resolution of identity in $\mathcal{H}$, i. e. a family $X = \{X_j\}$ of positive operators in $\mathcal{H}$ satisfying $\sum_j X_j = I$, where $I$ is the unit operator in $\mathcal{H}$. The index $j$ runs through some finite output alphabet, which is not fixed here. The conditional probability of the output $j$ given the input $i$ equals $P(j|i) = \mathrm{Tr}\, S_i X_j$. The Shannon information is given by the classical formula

$$I_1(\pi, X) = \sum_j \sum_i \pi_i P(j|i) \log \frac{P(j|i)}{\sum_k \pi_k P(j|k)} \tag{1}$$

(in what follows we use binary logarithms).
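As a small numerical illustration of the Shannon information defined above, the following sketch evaluates it for a toy qubit channel with $P(j|i) = \mathrm{Tr}\, S_i X_j$. The function names `trace_product` and `shannon_information`, the restriction to real matrices, and the example states and measurement are assumptions made for this illustration only.

```python
import math

def trace_product(a, b):
    """Return Tr(AB) for two square real matrices given as nested lists."""
    n = len(a)
    return sum(a[r][c] * b[c][r] for r in range(n) for c in range(n))

def shannon_information(pi, states, decoding):
    """Shannon information in bits for a coding pi and a decoding {X_j}.

    P(j|i) = Tr S_i X_j is the probability of output j given input i.
    """
    p_cond = [[trace_product(s, x) for x in decoding] for s in states]
    p_out = [sum(pi[i] * p_cond[i][j] for i in range(len(pi)))
             for j in range(len(decoding))]
    total = 0.0
    for i, weight in enumerate(pi):
        for j, q in enumerate(p_out):
            p = p_cond[i][j]
            if weight > 0 and p > 0:  # zero-probability terms contribute 0
                total += weight * p * math.log2(p / q)
    return total

# Hypothetical example: measurement in the basis |0>, |1>.
basis_measurement = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
# Signal states |0><0|, |1><1| (orthogonal) and |0><0|, |+><+| (nonorthogonal).
orthogonal_states = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
nonorthogonal_states = [[[1.0, 0.0], [0.0, 0.0]], [[0.5, 0.5], [0.5, 0.5]]]

info_orthogonal = shannon_information([0.5, 0.5], orthogonal_states, basis_measurement)
info_nonorthogonal = shannon_information([0.5, 0.5], nonorthogonal_states, basis_measurement)
```

For the orthogonal pair the information is one bit; for the nonorthogonal pair this particular (not necessarily optimal) measurement extracts noticeably less.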

In the same way we can consider the product channel in $\mathcal{H}^{\otimes n} = \mathcal{H} \otimes \dots \otimes \mathcal{H}$ with the input alphabet $A^n$ consisting of words $u = (i_1, \dots, i_n)$ of length $n$, with the d. o.

$$S_u = S_{i_1} \otimes \dots \otimes S_{i_n} \tag{2}$$

corresponding to the word $u$. If $\pi$ is a probability distribution on $A^n$ and $X$ is a resolution of identity in $\mathcal{H}^{\otimes n}$, we define the information quantity $I_n(\pi, X)$ by a formula similar to (1). Defining

$$C_n = \sup_{\pi, X} I_n(\pi, X),$$

we have the property of superadditivity $C_n + C_m \le C_{n+m}$, hence the following limit exists

$$C = \lim_{n \to \infty} C_n / n, \tag{3}$$

and is called the capacity of the initial channel […]. This definition is justified by the fact, easily deduced from the classical Shannon coding theorem, that $C$ is the least upper bound of the rate (bits/symbol) of information which can be transmitted with asymptotically vanishing error. More precisely, we call a code of size $N$ a sequence $(u_1, X_1), \dots, (u_N, X_N)$, where the $u_k$ are words of length $n$, and $\{X_k\}$ is a family of positive operators in $\mathcal{H}^{\otimes n}$ satisfying $\sum_{j=1}^N X_j \le I$. Defining $X_0 = I - \sum_{j=1}^N X_j$, we have a resolution of identity in $\mathcal{H}^{\otimes n}$. An output $k$ ($1 \le k \le N$) means the decision that the word $u_k$ was transmitted, while the output $0$ is interpreted as evasion of any decision. The average error probability for such a code is

$$P_{er} = \frac{1}{N} \sum_{k=1}^N [1 - \mathrm{Tr}\, S_{u_k} X_k].$$

Let us denote by $p(n, N)$ the minimum of this error probability with respect to all codes of size $N$ with words of length $n$. Then

$$p(n, 2^{n(C - \delta)}) \to 0 \quad \text{and} \quad p(n, 2^{n(C + \delta)}) \not\to 0, \tag{4}$$

where $\delta > 0$, if $n \to \infty$. The same holds for the minimum of the maximal (with respect to $k$) error probability, which does not presume any a priori probabilities for the words (see […], […]).

2. The entropy bound. The main result of […] was a lower bound for $C$ demonstrating the possibility of the inequality $C > C_1$ and implying strict superadditivity of the sequence $C_n$. This is in sharp contrast with the situation for the corresponding classical memoryless channel, for which $C_n = n C_1$ and hence $C = C_1$, and is just another manifestation of quantum nonseparability. This fact is in a sense dual to the existence of EPR correlations: the latter are due to entangled states and hold for disentangled measurements, while the superadditivity is due to entangled measurements and holds for disentangled states. The inequality $C \ne C_1$ raised the problem of the actual value of the capacity $C$.

Let $H(S) = -\mathrm{Tr}\, S \log S$ be the von Neumann entropy of a d. o. $S$ and let $\pi = \{\pi_i\}$ be an a priori distribution on $A$. Let us denote $\bar{S} = \sum_{i \in A} \pi_i S_i$, $\bar{H}(S_{(\cdot)}) = \sum_{i \in A} \pi_i H(S_i)$ and $\Delta H(\pi) = H(\bar{S}) - \bar{H}(S_{(\cdot)})$. The entropy bound […] combined with an additivity property proved in the Appendix implies $C \le \max_\pi \Delta H(\pi)$. In […] a conjecture was made that in fact this might be an equality. Recently Hausladen, Jozsa, Schumacher, Westmoreland and Wootters […] proved this in the case of pure states $S_i$ (apparently not knowing about the paper […]). The problem for the case of general (possibly mixed) states was left open and is the subject of our present work. The main result is the estimate for the error probability implying the converse inequality $C \ge \max_\pi \Delta H(\pi)$. Thus we have

Theorem. The capacity of the quantum communication channel with arbitrary signal states $S_i$ is given by

$$C = \max_\pi \Big[ H\Big(\sum_{i \in A} \pi_i S_i\Big) - \sum_{i \in A} \pi_i H(S_i) \Big], \tag{5}$$

confirming the old physical wisdom according to which the entropy bound was used to evaluate the quantum capacity […].
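To make the quantity in (5) concrete, the following sketch evaluates the entropy bound $\Delta H(\pi)$ for two small qubit ensembles with a fixed distribution $\pi$ (no maximization over $\pi$ is attempted). The helper names, the restriction to real symmetric $2 \times 2$ matrices and the example states are assumptions of this illustration.

```python
import math

def eigenvalues_2x2(m):
    """Eigenvalues of a real symmetric 2x2 matrix (closed form)."""
    half_trace = (m[0][0] + m[1][1]) / 2
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(max(half_trace ** 2 - det, 0.0))
    return [half_trace - root, half_trace + root]

def von_neumann_entropy(state):
    """H(S) = -Tr S log2 S, with the convention 0 log 0 = 0."""
    return -sum(x * math.log2(x) for x in eigenvalues_2x2(state) if x > 1e-15)

def entropy_bound(pi, states):
    """Delta H(pi) = H(sum_i pi_i S_i) - sum_i pi_i H(S_i)."""
    average = [[sum(w * s[r][c] for w, s in zip(pi, states)) for c in range(2)]
               for r in range(2)]
    mean_entropy = sum(w * von_neumann_entropy(s) for w, s in zip(pi, states))
    return von_neumann_entropy(average) - mean_entropy

# Hypothetical ensembles with equal a priori probabilities.
pure_states = [[[1.0, 0.0], [0.0, 0.0]], [[0.5, 0.5], [0.5, 0.5]]]   # |0>, |+>
mixed_states = [[[0.9, 0.0], [0.0, 0.1]], [[0.1, 0.0], [0.0, 0.9]]]  # commuting

bound_pure = entropy_bound([0.5, 0.5], pure_states)
bound_mixed = entropy_bound([0.5, 0.5], mixed_states)
```

For the pure pair the second term of $\Delta H$ vanishes, so the bound is just the entropy of the average state; for the mixed pair both terms contribute.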

The key points of the proof are the idea of projection onto the typical subspace due to […], modified here for the case of mixed states, and the estimate for the error probability, which is substantially more complicated than the estimate for pure states given already in […] and a similar estimate from […].
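3. The typical subspaces of density operators. Let $\bar{S} = \sum_{j \in D} \lambda_j |e_j\rangle \langle e_j|$ be the spectral decomposition of the d. o. $\bar{S}$; then the spectral decomposition of $\bar{S}^{\otimes n} = \bar{S} \otimes \dots \otimes \bar{S}$ is

$$\bar{S}^{\otimes n} = \sum_{J \in D^n} \lambda_J |e_J\rangle \langle e_J|,$$

where $J = (j_1, \dots, j_n)$, $\lambda_J = \lambda_{j_1} \cdot \ldots \cdot \lambda_{j_n}$, $|e_J\rangle = |e_{j_1}\rangle \otimes \dots \otimes |e_{j_n}\rangle$. Following […] we introduce the spectral projector onto the typical subspace of the d. o. $\bar{S}^{\otimes n}$ as

$$P = \sum_{J \in B} |e_J\rangle \langle e_J|, \tag{6}$$

where $B = \{J : 2^{-n[H(\bar{S}) + \delta]} < \lambda_J < 2^{-n[H(\bar{S}) - \delta]}\} \subset D^n$. A sequence $J \in B$ is "typical" for the probability distribution on $D^n$ given by the eigenvalues $\lambda_J$ of the d. o. $\bar{S}^{\otimes n}$ in the sense of classical information theory (see e. g. […]). It follows that for fixed small positive $\epsilon, \delta$ and all $n \ge n_1(\pi, \epsilon, \delta)$

$$\mathrm{Tr}\, \bar{S}^{\otimes n} (I - P) \le \epsilon. \tag{7}$$

Indeed, $\mathrm{Tr}\, \bar{S}^{\otimes n} P$ is equal to the probability

$$\mathsf{P}\{J \in B\} = \mathsf{P}\{n[H(\bar{S}) - \delta] < -\log \lambda_J < n[H(\bar{S}) + \delta]\} = \mathsf{P}\Big\{\Big| n^{-1} \sum_{l=1}^n \log \lambda_{j_l} + H(\bar{S}) \Big| < \delta \Big\},$$

which tends to 1 as $n \to \infty$, according to the Law of Large Numbers, since $H(\bar{S}) = -\mathsf{M} \log \lambda_{(\cdot)}$, where $\mathsf{M}$ denotes expectation.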
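The convergence of the typical-set probability to 1 can be observed numerically. The sketch below computes the probability that a sequence $J$ is typical for a qubit density operator with eigenvalues `lam` and `1 - lam`, grouping the sequences by the number `k` of indices carrying the eigenvalue `lam`. The function names and the parameter values are assumptions of this illustration; `math.comb` requires Python 3.8 or later.

```python
import math

def binary_entropy(lam):
    """Entropy in bits of the eigenvalue distribution (lam, 1 - lam)."""
    return -lam * math.log2(lam) - (1 - lam) * math.log2(1 - lam)

def typical_probability(lam, n, delta):
    """Probability that J is typical, for 0 < lam < 1.

    A sequence J with k indices of eigenvalue lam has
    lambda_J = lam^k (1 - lam)^(n - k); there are C(n, k) such sequences.
    """
    entropy = binary_entropy(lam)
    total = 0.0
    for k in range(n + 1):
        log_lambda_j = k * math.log2(lam) + (n - k) * math.log2(1 - lam)
        if abs(-log_lambda_j / n - entropy) < delta:
            # work with logarithms to avoid floating-point underflow
            total += 2.0 ** (math.log2(math.comb(n, k)) + log_lambda_j)
    return total

# Hypothetical parameters: lam = 0.1, delta = 0.1, increasing word length n.
probabilities = {n: typical_probability(0.1, n, 0.1) for n in (10, 100, 1000)}
```

With these parameters the probability grows with $n$ and is already close to 1 at $n = 1000$, in line with the Law of Large Numbers argument.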

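The next step is a development of this idea, necessary to prove the Theorem for mixed states. Let $S_i = \sum_{j \in D} \lambda^i_j |e^i_j\rangle \langle e^i_j|$ be the spectral decomposition of the d. o. $S_i$. Let $u = (i_1, \dots, i_n)$ be a word of the input alphabet and $S_u = S_{i_1} \otimes \dots \otimes S_{i_n}$ be the corresponding d. o. Its spectral decomposition is

$$S_u = \sum_{J \in D^n} \lambda^u_J |e^u_J\rangle \langle e^u_J|,$$

where $\lambda^u_J = \lambda^{i_1}_{j_1} \cdot \ldots \cdot \lambda^{i_n}_{j_n}$, $|e^u_J\rangle = |e^{i_1}_{j_1}\rangle \otimes \dots \otimes |e^{i_n}_{j_n}\rangle$. We introduce the spectral projector onto the typical subspace of $S_u$ as

$$P_u = \sum_{J \in B_u} |e^u_J\rangle \langle e^u_J|, \tag{8}$$

where $B_u = \{J : 2^{-n[\bar{H}(S_{(\cdot)}) + \delta]} < \lambda^u_J < 2^{-n[\bar{H}(S_{(\cdot)}) - \delta]}\}$.

Let the following probability distribution be defined on the set $A^n$ of all words:

$$\mathsf{P}\{u = (i_1, \dots, i_n)\} = \pi_{i_1} \cdot \ldots \cdot \pi_{i_n}. \tag{9}$$

Then for fixed small positive $\epsilon, \delta$ and all $n \ge n_2(\pi, \epsilon, \delta)$

$$\mathsf{M}\, \mathrm{Tr}\, S_u (I - P_u) \le \epsilon. \tag{10}$$

Indeed, consider the sequence of independent trials with the outcomes $(i_l, j_l)$, $l = 1, \dots, n$, where the probability of the outcome $(i, j)$ in each trial is equal to $\pi_i \lambda^i_j$. Then

$$\mathsf{M}\, \mathrm{Tr}\, S_u P_u = \mathsf{P}\{J \in B_u\} = \mathsf{P}\{n[\bar{H}(S_{(\cdot)}) - \delta] < -\log \lambda^u_J < n[\bar{H}(S_{(\cdot)}) + \delta]\} = \mathsf{P}\Big\{\Big| n^{-1} \sum_{l=1}^n \log \lambda^{i_l}_{j_l} + \bar{H}(S_{(\cdot)}) \Big| < \delta \Big\},$$

which tends to 1 as $n \to \infty$, according to the Law of Large Numbers, since $\bar{H}(S_{(\cdot)}) = -\mathsf{M} \log \lambda^{(\cdot)}_{(\cdot)}$. In what follows we put $n(\pi, \epsilon, \delta) = \max\{n_1(\pi, \epsilon, \delta), n_2(\pi, \epsilon, \delta)\}$.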

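4. The choice of the suboptimal decision rule. Let $u_1, \dots, u_N$ be a sequence of words. To simplify notation we denote the words by their numbers $1, \dots, N$. Put

$$X_u = \Big( \sum_{u'=1}^N P P_{u'} P \Big)^{-1/2} P P_u P \Big( \sum_{u'=1}^N P P_{u'} P \Big)^{-1/2}, \tag{11}$$

where $X^{-1/2}$ denotes the generalized inverse of the operator $X^{1/2}$, i. e. the operator equal to $0$ on $\mathrm{Ker}\, X$ and to $(X^{1/2})^{-1}$ on $(\mathrm{Ker}\, X)^\perp$. Then $\sum_{u=1}^N X_u \le I$. Put $|\hat{e}^u_J\rangle = P |e^u_J\rangle$, where $P$ is defined by (6); then

$$X_u = \Big( \sum_{u'=1}^N \sum_{J \in B_{u'}} |\hat{e}^{u'}_J\rangle \langle \hat{e}^{u'}_J| \Big)^{-1/2} \sum_{J \in B_u} |\hat{e}^u_J\rangle \langle \hat{e}^u_J| \Big( \sum_{u'=1}^N \sum_{J \in B_{u'}} |\hat{e}^{u'}_J\rangle \langle \hat{e}^{u'}_J| \Big)^{-1/2}.$$

By denoting

$$\alpha_{(u,J),(u',J')} = \Big\langle \hat{e}^u_J \Big| \Big( \sum_{u''=1}^N \sum_{J'' \in B_{u''}} |\hat{e}^{u''}_{J''}\rangle \langle \hat{e}^{u''}_{J''}| \Big)^{-1/2} \hat{e}^{u'}_{J'} \Big\rangle,$$

and taking into account that $X_u = P X_u P$, the average error probability corresponding to the choice (11) can be written as

$$P_{er} = \frac{1}{N} \sum_{u=1}^N \Big[ 1 - \sum_{J \in D^n} \sum_{J' \in B_u} \lambda^u_J |\alpha_{(u,J),(u,J')}|^2 \Big]. \tag{12}$$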
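The structure of the decision rule (11) can be illustrated in a drastically simplified special case: words of length $n = 1$, two pure real qubit states, and no typical-subspace projection (that is, $P = I$ and $P_u = |\psi_u\rangle \langle \psi_u|$). These simplifications, the helper names and the example vectors are assumptions of this sketch; it does not reproduce the general construction with typical projectors.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def sqrt_pd(m):
    """Square root of a real symmetric positive definite 2x2 matrix."""
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t], [m[1][0] / t, (m[1][1] + s) / t]]

def inverse(m):
    """Inverse of an invertible 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

def projector(v):
    """Rank-one projector |v><v| for a real unit vector v."""
    return [[v[r] * v[c] for c in range(2)] for r in range(2)]

def decision_operators(vectors):
    """X_u = G^(-1/2) |psi_u><psi_u| G^(-1/2) with G = sum_u |psi_u><psi_u|."""
    projs = [projector(v) for v in vectors]
    g = [[sum(p[r][c] for p in projs) for c in range(2)] for r in range(2)]
    g_inv_sqrt = inverse(sqrt_pd(g))
    return [matmul(matmul(g_inv_sqrt, p), g_inv_sqrt) for p in projs]

def average_error(vectors, operators):
    """P_er = (1/N) sum_u [1 - Tr S_u X_u] for pure states S_u."""
    correct = sum(sum(matmul(projector(v), x)[k][k] for k in range(2))
                  for v, x in zip(vectors, operators))
    return 1 - correct / len(vectors)

# Hypothetical signal vectors |0> and |+>.
signal_vectors = [[1.0, 0.0], [1 / math.sqrt(2), 1 / math.sqrt(2)]]
operators = decision_operators(signal_vectors)
p_error = average_error(signal_vectors, operators)
operator_sum = [[sum(x[r][c] for x in operators) for c in range(2)] for r in range(2)]
```

Because the two example vectors span the whole space, the operators here sum exactly to the identity; in general only the inequality $\sum_u X_u \le I$ holds.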



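5. The estimate for the error probability. Taking into account that $\sum_{J \in D^n} \lambda^u_J = 1$ and omitting some nonpositive terms, we see that

$$P_{er} \le \frac{1}{N} \sum_{u=1}^N \Big[ \sum_{J \in B_u} \lambda^u_J \big(1 - \alpha^2_{(u,J),(u,J)}\big) + \sum_{J \notin B_u} \lambda^u_J \Big]. \tag{13}$$

Let us denote

$$\gamma_{(u,J),(u',J')} = \langle \hat{e}^u_J | \hat{e}^{u'}_{J'} \rangle = \langle e^u_J | P e^{u'}_{J'} \rangle \tag{14}$$

and introduce the Gram matrix

$$\Gamma = [\gamma_{(u,J),(u',J')}],$$

where $J \in B_u$, $J' \in B_{u'}$ and $u, u' = 1, \dots, N$. Then

$$\Gamma^{1/2} = [\alpha_{(u,J),(u',J')}].$$

In particular, $\alpha^2_{(u,J),(u,J)} \le \gamma_{(u,J),(u,J)} \le 1$. Since the diagonal elements of the nonnegative definite matrix $\Gamma^{1/2}$ are nonnegative, this gives $0 \le \alpha_{(u,J),(u,J)} \le 1$ and hence $1 - \alpha^2_{(u,J),(u,J)} \le 2 (1 - \alpha_{(u,J),(u,J)})$. Then from (13)

$$P_{er} \le \frac{1}{N} \sum_{u=1}^N \Big[ 2 \sum_{J \in B_u} \lambda^u_J \big(1 - \alpha_{(u,J),(u,J)}\big) + \sum_{J \notin B_u} \lambda^u_J \Big]. \tag{15}$$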

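By introducing the diagonal matrix $\Lambda = \mathrm{diag}[\lambda^u_J]$ and denoting by $E$ the unit matrix and the trace of matrices by $\mathrm{Sp}$, as distinct from the trace of operators in Hilbert space, we have

$$2 \sum_{u=1}^N \sum_{J \in B_u} \lambda^u_J \big(1 - \alpha_{(u,J),(u,J)}\big) = 2\, \mathrm{Sp}\, \Lambda (E - \Gamma^{1/2})$$

$$= \mathrm{Sp}\, \Lambda (E - \Gamma^{1/2})^2 + \mathrm{Sp}\, \Lambda (E - \Gamma) \le \mathrm{Sp}\, \Lambda (E - \Gamma)^2 + \mathrm{Sp}\, \Lambda (E - \Gamma), \tag{16}$$

since $(E - \Gamma^{1/2})^2 = (E - \Gamma)^2 (E + \Gamma^{1/2})^{-2} \le (E - \Gamma)^2$. Calculating the traces, we obtain the right hand side of (16) as

$$\sum_{u=1}^N \sum_{J \in B_u} \lambda^u_J \Big[ 2 - 3 \gamma_{(u,J),(u,J)} + \gamma^2_{(u,J),(u,J)} + \sum_{J' \in B_u:\, J' \ne J} |\gamma_{(u,J),(u,J')}|^2 + \sum_{u':\, u' \ne u} \sum_{J' \in B_{u'}} |\gamma_{(u,J),(u',J')}|^2 \Big].$$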
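This quantity will not decrease if the range of $J$ is enlarged to the full range $D^n$ and if $2 - 3 \gamma_{(u,J),(u,J)} + \gamma^2_{(u,J),(u,J)}$ is replaced with $2 - 2 \gamma_{(u,J),(u,J)}$. Then we obtain

$$P_{er} \le \frac{1}{N} \sum_{u=1}^N \Big\{ \sum_{J \in D^n} \lambda^u_J \Big[ 2 - 2 \gamma_{(u,J),(u,J)} + \sum_{J' \in B_u:\, J' \ne J} |\gamma_{(u,J),(u,J')}|^2 + \sum_{u':\, u' \ne u} \sum_{J' \in B_{u'}} |\gamma_{(u,J),(u',J')}|^2 \Big] + \sum_{J \notin B_u} \lambda^u_J \Big\}.$$

Taking into account the definition (14) of $\gamma_{(u,J),(u',J')}$ and the fact that $\langle e^u_J | e^u_{J'} \rangle = 0$ for $J \ne J'$, we can write the last inequality as

$$P_{er} \le \frac{1}{N} \sum_{u=1}^N \Big\{ 2\, \mathrm{Tr}\, S_u (I - P) + \mathrm{Tr}\, S_u (I - P) P_u (I - P) + \sum_{u':\, u' \ne u} \mathrm{Tr}\, P S_u P P_{u'} + \mathrm{Tr}\, S_u (I - P_u) \Big\}.$$

The second term is less than or equal to $\mathrm{Tr}\, S_u (I - P)$. Thus, finally

$$P_{er} \le \frac{1}{N} \sum_{u=1}^N \Big\{ 3\, \mathrm{Tr}\, S_u (I - P) + \sum_{u':\, u' \ne u} \mathrm{Tr}\, P S_u P P_{u'} + \mathrm{Tr}\, S_u (I - P_u) \Big\}. \tag{17}$$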
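The matrix relations behind (16), namely the identity $2(E - \Gamma^{1/2}) = (E - \Gamma^{1/2})^2 + (E - \Gamma)$ and the subsequent inequality, can be checked numerically on a small example. The $2 \times 2$ matrix `gram`, the weights standing in for the diagonal of $\Lambda$, and the helper names below are arbitrary choices made for this sketch, not data from the paper.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def sqrt_pd(m):
    """Square root of a real symmetric positive definite 2x2 matrix."""
    s = math.sqrt(m[0][0] * m[1][1] - m[0][1] * m[1][0])
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[0][0] + s) / t, m[0][1] / t], [m[1][0] / t, (m[1][1] + s) / t]]

def weighted_trace(weights, m):
    """Sp(Lambda M) for the diagonal matrix Lambda = diag(weights)."""
    return sum(w * m[k][k] for k, w in enumerate(weights))

def subtract(a, b):
    """Entrywise difference of two 2x2 matrices."""
    return [[a[r][c] - b[r][c] for c in range(2)] for r in range(2)]

# Hypothetical positive definite matrix and nonnegative weights.
gram = [[0.9, 0.3], [0.3, 0.8]]
weights = [0.6, 0.4]
unit = [[1.0, 0.0], [0.0, 1.0]]

e_minus_root = subtract(unit, sqrt_pd(gram))
e_minus_gram = subtract(unit, gram)

lhs = 2 * weighted_trace(weights, e_minus_root)
middle = (weighted_trace(weights, matmul(e_minus_root, e_minus_root))
          + weighted_trace(weights, e_minus_gram))
rhs = (weighted_trace(weights, matmul(e_minus_gram, e_minus_gram))
       + weighted_trace(weights, e_minus_gram))
```

Here `lhs` and `middle` agree up to rounding, and `lhs` does not exceed `rhs`, as the chain of relations in (16) requires.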

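6. The random coding. Let us assume that the words $u_1, \dots, u_N$ are chosen at random, independently and with the probability distribution (9) for each word. Then $\mathsf{M} S_u = \bar{S}^{\otimes n}$ […] and from (17), by independence of $S_u$ and $P_{u'}$ for $u' \ne u$,

$$\mathsf{M} P_{er} \le 3\, \mathrm{Tr}\, \bar{S}^{\otimes n} (I - P) + (N - 1)\, \mathrm{Tr}\, P \bar{S}^{\otimes n} P\, \mathsf{M} P_{u'} + \mathsf{M}\, \mathrm{Tr}\, S_u (I - P_u).$$

By the inequalities (7), (10) and by the properties of the trace,

$$\mathsf{M} P_{er} \le 4 \epsilon + (N - 1)\, \| P \bar{S}^{\otimes n} P \|\, \mathrm{Tr}\, \mathsf{M} P_{u'},$$

for $n \ge n(\pi, \epsilon, \delta)$. By the definition of $P$,

$$\| P \bar{S}^{\otimes n} P \| \le 2^{-n[H(\bar{S}) - \delta]},$$

and by the definition of $P_u$,

$$\mathrm{Tr}\, \mathsf{M} P_{u'} = \mathsf{M}\, \mathrm{Tr}\, P_{u'} \le \mathsf{M}\, \mathrm{Tr}\, S_{u'} \cdot 2^{n[\bar{H}(S_{(\cdot)}) + \delta]} = 2^{n[\bar{H}(S_{(\cdot)}) + \delta]}.$$

Thus

$$\mathsf{M} P_{er} \le 4 \epsilon + (N - 1)\, 2^{-n[H(\bar{S}) - \bar{H}(S_{(\cdot)}) - 2 \delta]}. \tag{18}$$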
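Let us choose the distribution $\pi = \pi^0$ maximizing the entropy bound $\Delta H(\pi)$. Then (18) implies

$$p(n, N) \le 4 \epsilon + (N - 1)\, 2^{-n[\Delta H(\pi^0) - 2 \delta]} \tag{19}$$

for $n \ge n(\pi^0, \epsilon, \delta)$. For $N = 2^{n[\Delta H(\pi^0) - 3 \delta]}$ the second term is at most $2^{-n \delta}$, and $\epsilon$ is arbitrary. Thus $p(n, 2^{n[\Delta H(\pi^0) - 3 \delta]}) \to 0$ as $n \to \infty$, whence $\Delta H(\pi^0) - 3 \delta \le C$ by (4) for arbitrary $\delta$, and (5) follows.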
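The behaviour of the final bound on $p(n, N)$ at the rate $\Delta H(\pi^0) - 3\delta$ can be tabulated directly. In the sketch below the number of codewords is treated as the real number $2^{nR}$, and the values of $\Delta H(\pi^0)$, $\delta$ and $\epsilon$ are hypothetical; the function name `random_coding_bound` is likewise an assumption of this illustration.

```python
def random_coding_bound(n, rate, delta_h, eps, delta):
    """Value of 4 eps + (N - 1) 2^(-n [delta_h - 2 delta]) for N = 2^(n * rate).

    The product is expanded as 2^(n (rate - delta_h + 2 delta)) minus
    2^(-n (delta_h - 2 delta)) so that 2^(n * rate) is never formed explicitly.
    """
    second_term = (2.0 ** (n * (rate - delta_h + 2 * delta))
                   - 2.0 ** (-n * (delta_h - 2 * delta)))
    return 4 * eps + second_term

# Hypothetical values: entropy bound 0.6 bits, delta = 0.05, eps = 0.01.
delta_h, delta, eps = 0.6, 0.05, 0.01
rate = delta_h - 3 * delta  # rate just below the entropy bound
bounds = [random_coding_bound(n, rate, delta_h, eps, delta) for n in (10, 100, 1000)]
```

As $n$ grows the second term decays like $2^{-n\delta}$ and the bound approaches $4\epsilon$, which is the mechanism behind the final step of the proof.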
Appendix. Let $A_k$, $k = 1, 2$, be finite alphabets and let $\{S^k_i;\ i \in A_k\}$ be families of d. o. in Hilbert spaces $\mathcal{H}_k$. Let $\{\pi_{ij}\}$ be a probability distribution on $A_1 \times A_2$ and denote $\Delta H(\{\pi_{ij}\}) = H(\sum_{ij} \pi_{ij} S^1_i \otimes S^2_j) - \sum_{ij} \pi_{ij} H(S^1_i \otimes S^2_j)$. We wish to prove

$$\max \Delta H(\{\pi_{ij}\}) = \max \Delta H(\{\pi^1_i\}) + \max \Delta H(\{\pi^2_j\}).$$

By the property of entropy

$$H(S) \le H(\mathrm{Tr}_2 S) + H(\mathrm{Tr}_1 S),$$

where $S$ is a d. o. in $\mathcal{H}_1 \otimes \mathcal{H}_2$ and $\mathrm{Tr}_k S$, $k = 1, 2$, is the partial trace with respect to $\mathcal{H}_k$, proved in […], we have

$$H\Big(\sum_{ij} \pi_{ij} S^1_i \otimes S^2_j\Big) \le H\Big(\sum_i \pi^1_i S^1_i\Big) + H\Big(\sum_j \pi^2_j S^2_j\Big),$$

where $\{\pi^1_i\}$, $\{\pi^2_j\}$ are the marginal distributions of $\{\pi_{ij}\}$. It follows that

$$\max \Delta H(\{\pi_{ij}\}) \le \max \Delta H(\{\pi^1_i\}) + \max \Delta H(\{\pi^2_j\}).$$

The converse inequality follows by restricting to $\pi_{ij} = \pi^1_i \times \pi^2_j$ and using the additivity of quantum entropy for product states.